Acceleration and Averaging in Stochastic Descent Dynamics

نویسندگان

  • Walid Krichene
  • Peter L. Bartlett
چکیده

[1] Nemirovski and Yudin. Problems Complexity and Method Efficiency in Optimization. Wiley-Interscience series in discrete mathematics. Wiley, 1983. [2] W. Krichene, A. Bayen and P. Bartlett. Accelerated Mirror Descent in Continuous and Discrete Time. NIPS 2015. [3] W. Su, S. Boyd and E. Candes. A differential equation for modeling Nesterov's accelerated gradient method: theory and insights. NIPS 2014. Deterministic ( ) Stochastic

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

On Stochastic Subgradient Mirror-Descent Algorithm with Weighted Averaging

This paper considers stochastic subgradient mirror-descent method for solving constrained convex minimization problems. In particular, a stochastic subgradient mirror-descent method with weighted iterate-averaging is investigated and its per-iterate convergence rate is analyzed. The novel part of the approach is in the choice of weights that are used to construct the averages. Through the use o...

متن کامل

Stochastic averaging for SDEs with Hopf Drift and polynomial diffusion coefficients

It is known that a stochastic differential equation (SDE) induces two probabilistic objects, namely a difusion process and a stochastic flow. While the diffusion process is determined by the innitesimal mean and variance given by the coefficients of the SDE, this is not the case for the stochastic flow induced by the SDE. In order to characterize the stochastic flow uniquely the innitesimal cov...

متن کامل

Stochastic Proximal Gradient Descent with Acceleration Techniques

Proximal gradient descent (PGD) and stochastic proximal gradient descent (SPGD) are popular methods for solving regularized risk minimization problems in machine learning and statistics. In this paper, we propose and analyze an accelerated variant of these methods in the mini-batch setting. This method incorporates two acceleration techniques: one is Nesterov’s acceleration method, and the othe...

متن کامل

DeepSpark: Spark-Based Deep Learning Supporting Asynchronous Updates and Caffe Compatibility

The increasing complexity of deep neural networks (DNNs) has made it challenging to exploit existing large-scale data process pipelines for handling massive data and parameters involved in DNN training. Distributed computing platforms and GPGPU-based acceleration provide a mainstream solution to this computational challenge. In this paper, we propose DeepSpark, a distributed and parallel deep l...

متن کامل

Iterate averaging as regularization for stochastic gradient descent

We propose and analyze a variant of the classic Polyak-Ruppert averaging scheme, broadly used in stochastic gradient methods. Rather than a uniform average of the iterates, we consider a weighted average, with weights decaying in a geometric fashion. In the context of linear least squares regression, we show that this averaging scheme has a the same regularizing effect, and indeed is asymptotic...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017